Investigating Convergence of Restricted Boltzmann Machine Learning
Authors
Abstract
Restricted Boltzmann Machines (RBMs) are increasingly popular tools for unsupervised learning. They are very general, can cope with missing data, and are used to pretrain deep learning machines. RBMs learn a generative model of the data distribution. As exact gradient ascent on the data likelihood is infeasible, Markov Chain Monte Carlo approximations to the gradient, such as Contrastive Divergence (CD), are typically used. Even though there are some theoretical insights into this algorithm, it is not guaranteed to converge. Recently it has been observed that, after an initial increase in likelihood, training degrades if no additional regularization is used. The regularization parameters, however, cannot be determined even for medium-sized RBMs. In this work, we investigate the learning behavior of training algorithms by varying a minimal set of parameters and show that, with relatively simple variants of CD, it is possible to obtain good results even without further regularization. Furthermore, we show that it is not necessary to tune many hyperparameters to obtain a good model: finding a suitable learning rate is sufficient. Fast learning, however, comes with a higher risk of divergence and therefore requires a stopping criterion. For this purpose, we investigate the commonly used Annealed Importance Sampling, an approximation to the true log-likelihood of the data, and find that it completely fails to detect divergence in certain cases.
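To make the Contrastive Divergence approximation mentioned in the abstract concrete, the following is a minimal sketch of one CD-1 update for a binary RBM, written in plain Python so that it runs without extra libraries. It is an illustration under stated assumptions, not the specific CD variants studied in the paper: the function names (`cd1_update`, `hidden_probs`, `visible_probs`), the layout of `W`, and the single-example update are all hypothetical choices. The learning rate `lr` is the only hyperparameter, and no regularization term is included.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    # Draw independent binary units from their activation probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def hidden_probs(v, W, c):
    # W[i][j] couples visible unit i with hidden unit j (assumed layout).
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, lr, rng):
    """Apply one CD-1 update in place for a single binary example v0."""
    # Positive phase: hidden statistics driven by the data.
    ph0 = hidden_probs(v0, W, c)
    h0 = sample(ph0, rng)
    # Negative phase: a single Gibbs step stands in for sampling the model.
    v1 = sample(visible_probs(h0, W, b), rng)
    ph1 = hidden_probs(v1, W, c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return v1


if __name__ == "__main__":
    rng = random.Random(0)
    n_visible, n_hidden = 6, 3
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    data = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]
    for epoch in range(100):
        for v in data:
            cd1_update(v, W, b, c, lr=0.1, rng=rng)
```

Because the negative phase uses only one Gibbs step, the update is a biased estimate of the likelihood gradient, which is consistent with the abstract's point that CD is not guaranteed to converge and that a stopping criterion is needed.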
Similar resources
A Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images
Nowadays, ground vehicle monitoring (GVM) is one of the application areas of intelligent traffic control systems that use image processing methods. In this context, thermal infrared images from unmanned aerial vehicles (UAV-TIR) are one of the best options for GVM because of their suitable spatial resolution, cost-effectiveness, and low image volume. The methods that have been prop...
Statistical mechanics of unsupervised feature learning in a restricted Boltzmann machine with binary synapses
Revealing hidden features in unlabeled data is called unsupervised feature learning, which plays an important role in pretraining a deep neural network. Here we provide a statistical mechanics analysis of the unsupervised learning in a restricted Boltzmann machine with binary synapses. A message passing equation to infer the hidden feature is derived, and furthermore, variants of this equation ...
Application of continuous restricted Boltzmann machine to detect multivariate anomalies from stream sediment geochemical data, Korit, East of Iran
Anomaly separation using stream sediment geochemical data plays an essential role in regional exploration. Many different techniques have been proposed to separate anomalies from the rest of the study area. In this research, a continuous restricted Boltzmann machine (CRBM), which is a generative stochastic artificial neural network, was used to recognize the mineral potential area in the Korit 1:100000 sheet, loc...
Advances in Deep Learning
Deep neural networks have recently become increasingly popular under the name of deep learning, owing to their success in challenging machine learning tasks. Although this popularity is mainly due to recent successes, the history of neural networks goes back as far as 1958, when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...
Stochastic Difference of Convex Algorithm and its Application to Training Deep Boltzmann Machines
Difference of convex functions (DC) programming is an important approach to nonconvex optimization problems because these structures can be encountered in several fields. Effective optimization methods, called DC algorithms, have been developed in deterministic optimization literature. In machine learning, a lot of important learning problems such as the Boltzmann machines (BMs) can be formulat...